Skip to content

A small and fast C++ neural network inference engine

License

Notifications You must be signed in to change notification settings

andreanobile/forward

Repository files navigation

forward

a small and fast C++ neural network inference engine

On my laptop I get over 1.6x speedup using this code (No MKL) compared to Caffe+MKL, both using 1 core. This code scales almost lienearly with the number of cores by processing each image in a separate thread. The amount of memory used is also very different, almost 4x less!

Caffe (resnet-50) used from python on the famous kitten picture: Maximum resident set size (kbytes): 487384

forward on the same picture: Maximum resident set size (kbytes): 136900

A package containing extracted networks to test this program is available here: https://drive.google.com/open?id=1QWEXEsZO6OhqolNyzhf2AXxn_07jHnlt

The resnet network in places_r152 was taken from https://github.com/CSAILVision/places365 (their licence applies)

The resnet networks (r50 r101 r152) are trained on imagenet by MSRA
https://github.com/KaimingHe/deep-residual-networks (their licence applies)

To run the program a HASWELL (supporting AVX, fma) or newer cpu is needed. A Linux system is also needed. The code depends on turbojpeg and libgd to load and rescale jpeg images.

BUILD:

$ mkdir build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release ..
$ make
$ cd ..

in the source directory use cmake. USAGE:

download the extracted networks from the googledrive link. The test program looks into ../networks . To run with the places 365 network on your pictures contained in the directory ~/Pictures :

$ ./nnlib_test places ~/Pictures/*.jpg

To run the resnet-50 network trained on imagenet:

$ ./nnlib_test r50  ~/Pictures/*.jpg

resnet-101

$ ./nnlib_test r101  ~/Pictures/*.jpg

resnet-152

$ ./nnlib_test r152  ~/Pictures/*.jpg

Releases

No releases published

Packages

No packages published